Lei Shi, Institute of Software, Chinese Academy of Sciences, shijim@gmail.com PRIMARY
Qi Liao, Central Michigan University, qi.liao@cmich.edu
Chunxin Yang, Northwestern Polytechnical University, chunxin11@163.com
Student Team: No
Video:
Answers to Mini-Challenge 2 Questions:

MC 2.1 Using your visual analytics tools, can you identify
what noteworthy events took place for the time period covered in the firewall
and IDS logs? Provide screen shots of your visual analytics tools that
highlight the five most noteworthy events of security concern, along with
explanations of each event.

The firewall/IDS logs of the regional Bank of Money (BoM) network are processed
into traffic flow graphs and a per-host anomaly list (see the Data Processing
part for details), and then visualized in the NSAV tool (see the Visualization
part for details). All 40 hours of data are loaded into our tool to obtain an
abstracted overview of the network traffic within BoM during the inspected time
period (see the graph abstraction part for details), as shown in Figure 1(a). We
then manually drag and drop the machine groups into larger groups by machine
type to generate Figure 1(b), which indicates three major traffic types in BoM:
“Workstation Group I <--> Headquarter DCs & Website Group I”,
“Workstation Group II & External DNS <--> Regional Domain/DNS
server”, and “Workstation Group III <--> Website Group II”.

Figure 1. Overview pictures of the traffic flow graphs in the BoM office: (a)
after the abstraction; (b) after a further manual grouping.

We then attach the anomaly lists (see Figure 12) detected in the data
pre-processing step to the graph links. The result is shown in Figure 2.

Figure 2. NSAV tool visualizing both the traffic graph abstraction and the
anomalies on the graph links (flows). The grouped node (10.32.5.58+), which
represents two similar machines (10.32.5.58, 10.32.5.59), is selected in the
graph. Their temporal anomaly distributions are plotted in the bottom-right
panel. Details of each anomaly are shown in a tooltip on mouse hover.

We identify the noteworthy events from the anomaly graph in a
divide-and-conquer manner, examining each of the isolated subgraphs (connected
components) in turn.

In the first subgraph of the traffic network (Figure 3), the “I” and “M” icons
appear frequently and almost always in pairs, in opposite directions. Selecting
the “IRC-Malware-Infection_s” anomaly (icon “I”) in the anomaly type list
reveals three groups of machines, highlighted in red in the graph. They are all
workstations attacking and sending malware to a portion of the 12 websites
(10.32.5.*), highlighted in blue in the graph. We further select two typical
attackers (172.23.123.105, 172.23.231.174) and two websites (10.32.5.50,
10.32.5.52) in the graph filter panel, and the temporal anomaly distributions of
these four machines are plotted in the temporal anomaly panel. The plot shows
that the attacks on the websites span the whole inspected time period. Note
that a few websites (10.32.5.50, 68, 69) do not reply with the IRC
authorization message (icon “M”), but all the other websites (10.32.5.51-59) do
reply, indicating that they are running IRC services and are vulnerable to the
IRC attacks.

Figure 3. Three groups of machines initiating IRC Malware Infections to the
websites through port 6667.

A detailed examination of hosts 172.23.231.174 and 172.23.231.175 (two all-time
attackers) shows fine-grained patterns (Figure 4): the attacks are composed of
two stages, separated by a small gap in the middle of 172.23.231.174’s temporal
panel. A drill-down analysis of 172.23.231.175 at this gap shows that the first
stage ends with a very large port number (43325) and the second stage starts
with a relatively small one (1185). After checking the anomaly file of
172.23.231.175 (top-left of Figure 4), we deduce that the attacks from the
workstations are probably programmed, using sequentially enumerated source
ports on the compromised systems. They can be classified as DoS attacks intended
to exhaust the websites’ processing capacity and network bandwidth.

Figure 4. Detailed inspection of 172.23.231.174 and 172.23.231.175 on their
temporal anomaly distributions. Two stages of source-port-sweep IRC attacks are
identified with programmed behaviors.

In the same
subgraph, we also find anomalies on the workstations indicating FTP/SSH
connections to the websites (Figure 5). The connection attempts concentrate on
10.32.5.50-57, indicated by the grey versions of icons “C” and “S”. We split the
potential sources into sub-groups by anomaly type, and select the destination
machine group 10.32.5.50-57. The temporal anomaly panel shows that the first
stage of the FTP/SSH connection attempts falls mostly within the first 6 hours,
either in parallel with or preceding the IRC attacks. In the second stage,
synchronized with the second stage of the IRC attacks, only FTP connections are
attempted. In both stages, no reply from the websites is recorded. This behavior
suggests that the compromised systems (workstations) probe for FTP/SSH services
on the websites, probably in preparation for subsequent DoS attacks. However, no
further events happen in this thread, since no FTP/SSH services are hosted on
the websites.

Figure 5. FTP/SSH connection attempts to websites 10.32.5.50-57. The related
workstation machines are grouped both by neighbor set and by node anomaly type.

In the same
subgraph, another group of 5 workstation machines shows more than FTP/SSH
connections, identified by the “A”, “T” and “M” icons. We manually group them
together and check their temporal anomalies. The extra connections mostly occur
at the beginning of the inspected period. The anomaly descriptions indicate
that the connection attempts are potential scans of database
(PostgreSQL/Oracle/MySQL), remote desktop (VNC), mail (POP3, IMAP) and other
(SNMP) services. The destination IP, 172.23.0.1, is the external interface of
the firewall leading out of the regional network, so we cannot tell which hosts
outside the regional network are scanned. However, we know that none of these
connections succeeds, because no reverse traffic is detected.

Figure 6. Database/remote desktop/mail service connection attempts to
172.23.0.1. The 5 related workstation machines are grouped together because they
share the same set of anomaly types.

Another connected
component of the traffic graph, shown in Figure 7, is centered on the
Domain/DNS server machine 172.23.0.10. In total, 89 workstations send suspicious
traffic to the server, indicated by two types of anomalies. The temporal anomaly
panel of Figure 7 shows the details for the server (172.23.0.10) and two typical
workstation machines (172.23.1.104, 172.23.1.105). The “P” icon indicates DNS
updates to the server, which are suspected to be DNS hijacking/spoofing attacks
because workstations should not be responsible for updating the DNS table. The
“G” icon indicates Generic-Protocol-Command-Decode. It has two sub-classes by
description: asn1 buffer overflow attempts and IPC$ share access attempts.
Both exploit the vulnerabilities and resources of the server. A drill-down
analysis of 172.23.1.105 with an enlarged window shows that the
vulnerability/resource exploits follow a regular pattern, one every 15 minutes
from each workstation at sequentially enumerated ports, and are highly
suspected to be programmed attacks from the compromised systems.

Figure 7. The traffic flow graph centered on the regional domain/DNS server
(172.23.0.10). Potential DNS hijacking/spoofing events and vulnerability
exploits are detected from a group of 89 workstations to the server.

In the last
connected component, the traffic consists mostly of web page retrievals by 2700+
workstations from 16 websites, as well as financial transactions and web mail
access with the headquarter data center. 30 unknown hosts (172.28.29.*) are
identified, in a different Class B network from the regional BoM machines.
These suspicious hosts have two-way connections (or connection attempts) with
both the websites and the headquarter data centers. Note that the inbound
traffic from the websites and headquarter data centers to these hosts is all
denied at the firewall. Even so, this differs significantly from the other
workstations, toward which no inbound traffic or connection attempt is made, so
these hosts are highly suspected to be involved in source IP spoofing.

Figure 8. Traffic graph for website visits and headquarter data center access.
30 unknown workstation hosts are identified.

MC 2.2 What security trend is apparent in
the firewall and IDS logs over the course of the two days included here?
Illustrate the identified trend with an informative and innovative
visualization.

For the IRC malware infection events shown in Figures 3 and 4, the second cycle
of attacks is about to finish at the end of the inspected time period. However,
we believe that a third and further cycles of the same type of attack will
follow unless measures are taken to mitigate the event.
Moreover, comparing the traffic graphs of the first 10 hours and the last 10
hours, as shown in Figure 9, it is easy to see that the first period has fewer
workstation machines (4+253+57 = 314) involved in the suspicious attacks, while
the second period has more probably-compromised workstation machines
(230+318 = 548).

Figure 9. Comparison of the IRC Malware Infection related traffic during the
inspected time period: (a) in the first 10 hours; (b) in the last 10 hours.

Figure 10 shows the comparison of the DNS server related traffic graphs.
The two periods remain almost the same. We can deduce from the previous
analysis of Figure 7 that the vulnerability exploits against the DNS server will
continue, because the current attempt rate is one per 15 minutes from each
compromised workstation machine, only reaching port 25xx at each host by the
end of the entire period. This security event will last at least tens of times
longer than the inspected period.

Figure 10. Comparison of the Domain/DNS server-centric traffic graphs: (a) in
the first 10 hours; (b) in the last 10 hours.

MC 2.3 What do you suspect is (are) the
root cause(s) of the events identified in MC 2.1? Understanding that you cannot shut down the
corporate network or disconnect it from the internet, what actions should the
network administrators take to mitigate the root cause problem(s)?

The root
cause of the security events is probably that the workstation systems
(172.23.*.*) were compromised through weak passwords and system vulnerabilities
and therefore became members of botnets. Malware running on these workstations
continues to scan and exploit security holes in other workstations, servers and
external websites. The compromised machines (bots) can also initiate DDoS
attacks against the Domain/DNS servers and external websites.

The possible mitigations are:

1) Cut off the command and control (C&C) channels of the botmaster. The
enterprise network administrator should block all IRC related traffic involving
port 6667.
2) Change the passwords immediately. The first step for many attackers to
compromise a system is simply to try to connect with SSH and guess passwords.
The administrator should adopt a strong password policy for users and set up a
password expiration and rotation policy.

3) Shut down the unnecessary services in the regional network, such as FTP,
Remote Desktop, IRC, SSH, Database, etc. Reconfigure the firewall rules so that
traffic going to unused service ports is dropped at the firewall in the first
place.

4) Run vulnerability scans to identify vulnerable systems, then patch the
systems to the latest standard to protect them from security threats. Run a
virus scanning program to remove the viruses.

5) Secure the servers in the regional network, e.g., configure the DNS server
to refuse all DNS updates from illegitimate DNS servers. Use public-key
authentication for DNS updates.

While these
suggestions may be helpful, there is no panacea for all security problems.
The proposed Network Security and Anomaly Visualization (NSAV) tool can provide
a time-efficient alternative for network operators and administrators, such as
those at Bank of Money, not only to detect but, more importantly, to find the
root causes of network security anomalies when they happen.

The data processing takes two steps: anomaly parsing and traffic flow graph
generation.

We extract the potential anomalies from the firewall and IDS logs. A common
format is defined for all types of anomalies:

<Timestamp>, <Host Machine IP (:port)>, <Anomaly Type>, <Detailed Description>

To parse the firewall logs, we take a white list approach. We manually write a
good rule set, as in Figure 11, according to our interpretation of the BoM
network operation policies and configurations. This takes us approximately 2
hours in total, including the initial setting and the iterative changes to the
rule set.

Figure 11. Firewall good rule set.

The resulting firewall anomalies are the traffic not matched by any of the
rules. Each flow generates a “_s” anomaly on the source machine and a “_d”
anomaly on the destination machine. The firewall anomalies are further
partitioned into 5 types, as shown in Figure 12. A sample firewall anomaly is
given below.

Firewall anomaly example:
1333789124,172.23.254.80,IRC-Malware-Infection_s,172.23.254.80:2275 --> 10.32.5.50:6667

For the IDS logs, all the records are kept as anomalies, because the IDS has
already done the filtering. 6 IDS anomaly types are present in the data. A
sample is given below.

IDS log anomaly sample:
1333737240,172.23.254.80,Misc-activity_d,ET POLICY IRC authorization message 10.32.5.55: 6667-->172.23.254.80:1534

The traffic flow graphs indicating the live
network topology are constructed directly from the firewall NetFlow data, where
each source IP address and port number has an established connection state with
a destination IP and port. For conciseness, only IP-level connections are used
as network edges. Time is partitioned by a preset window size, 3600s by default.
Each flow is recorded in the consecutive time windows spanned by its built and
teardown timestamps. Eventually one flow graph is generated for each time
window, for flexibility. During online visualization, the user can select
several consecutive time slots and the corresponding graphs are aggregated on
the fly. Note that a caching mechanism is applied to speed up the processing.

All the graph visualizations shown in this submission apply the
loss-free graph abstraction method. The basic idea is to group nodes with the
same neighbor set together into mega-nodes. The node and edge attributes of a
mega-node are aggregated from the underlying original nodes. In most cases, the
graph abstraction reduces the graph complexity (measured by the number of
nodes) by more than 95%, and in this case by 99.5%. The abstracted graph is
guaranteed to preserve many critical features of the original graph:
connectivity, shortest paths, node affinity, and, importantly, all the
connections (flows in the security graph). The graph abstraction algorithm is
deterministic, single-pass, and scales to graphs of a million nodes. For more
details of the graph abstraction method, the reader can refer to the technical
report: CNG Report (unpublished manuscript, all rights reserved).

In the bottom right panel of the NSAV tool, a temporal visualization
is shown to present the distribution of anomalies on the selected hosts. The
horizontal axis is time and the vertical axis is the anomaly type, with one row
per anomaly type. Each anomaly is plotted as one icon, with the icon letter
indicating the anomaly type (Figure 12). The view can be zoomed both vertically
and horizontally. Hovering the mouse over an icon shows the details of the
anomaly.
1. IRC Malware Infection: from 172.23.231-240.*, 123-136.*, 0.*, 1.*, 252.*,
254.* (580 workstations in total) to 10.32.5.50-59, 68-69 (12 websites)
2. Failed FTP/SSH connection: from 172.23.231-240.*, 123-136.*, 0.*, 1.*,
252.*, 254.* (580 workstations in total) to 10.32.5.50-59, 68-69 (12 websites)
3. Failed database/remote desktop/mail services connection: from
172.23.231-240.* (5 workstations) to 172.23.0.1 (external regional network
interface)
4. Potential DNS hijacking/spoofing and vulnerability exploits over the DNS
server: from 172.23.0.*, 1.*, 5.* (89 workstations) to 172.23.0.10 (Domain/DNS
controller)
5. Unknown hosts: 172.28.29.* (30 workstations)
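The sequential source-port enumeration noted for the IRC attacks (Figure 4) can also be checked mechanically. The following is only a minimal sketch in Python and is not part of the NSAV tool. It assumes anomaly descriptions in the "src_ip:port --> dst_ip:port" form of the firewall anomaly example, already sorted by time for one host. The names `source_ports` and `sequential_fraction` and the `max_step` threshold are hypothetical choices made for illustration.

```python
import re

# Matches "a.b.c.d:port --> e.f.g.h:port", tolerating optional spaces
# around the colon and the arrow (as seen in the IDS sample).
FLOW = re.compile(
    r"(\d+\.\d+\.\d+\.\d+):\s*(\d+)\s*-->\s*(\d+\.\d+\.\d+\.\d+):\s*(\d+)"
)


def source_ports(descriptions):
    """Extract the source port of each flow description, in input order."""
    ports = []
    for text in descriptions:
        match = FLOW.search(text)
        if match:
            ports.append(int(match.group(2)))
    return ports


def sequential_fraction(ports, max_step=8):
    """Fraction of consecutive port pairs that increase by a small step.

    max_step is an assumed tolerance, not a value from the submission.
    """
    if len(ports) < 2:
        return 0.0
    steps = [b - a for a, b in zip(ports, ports[1:])]
    hits = sum(1 for step in steps if 0 < step <= max_step)
    return hits / len(steps)
```

A fraction close to 1 would be consistent with programmed, enumerated source ports. A single large negative step, such as the jump from port 43325 to port 1185, would mark the boundary between two attack stages.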
APPENDIX
• Data Processing Details
1. Anomaly parsing
Figure 12. Types of network anomalies detected in BoM regional office.
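The common anomaly format described in the data processing text (timestamp, host IP, anomaly type, detailed description) is straightforward to parse. The sketch below is a minimal Python illustration under stated assumptions, not the authors' parser: the `Anomaly` record and `parse_anomaly` function are hypothetical names, and it assumes the first three fields never contain commas, so the description is taken as everything after the third comma.

```python
from typing import NamedTuple


class Anomaly(NamedTuple):
    timestamp: int     # Unix timestamp, as in the samples
    host: str          # host machine IP, optionally with ":port"
    anomaly_type: str  # anomaly type without the "_s"/"_d" suffix
    role: str          # "source", "destination", or "unknown"
    description: str   # free-text details, e.g. the flow endpoints


def parse_anomaly(line: str) -> Anomaly:
    """Parse one anomaly record in the common four-field format."""
    # Split on the first three commas only; the description may contain more.
    ts, host, kind, desc = line.strip().split(",", 3)
    base, sep, suffix = kind.rpartition("_")
    if sep and suffix in ("s", "d"):
        role = "source" if suffix == "s" else "destination"
    else:
        base, role = kind, "unknown"
    return Anomaly(int(ts), host.strip(), base, role, desc.strip())
```

Applied to the firewall anomaly example, this yields the type "IRC-Malware-Infection" with role "source" for host 172.23.254.80.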
2. Traffic flow graph generation
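The windowing scheme described in the data processing text (a 3600s window, each flow recorded in every window between its built and teardown timestamps, and on-the-fly aggregation of consecutive windows) can be sketched as follows. This is a minimal Python illustration, not the NSAV implementation: the flow tuple layout `(src_ip, dst_ip, built, teardown)` and the function names are assumptions, edges are kept as plain sets of IP pairs, and the caching mentioned in the text is omitted.

```python
from collections import defaultdict

WINDOW = 3600  # default window size in seconds, as stated in the text


def build_window_graphs(flows, window=WINDOW):
    """Return {window_index: set of (src_ip, dst_ip) edges}.

    Each flow is an assumed (src_ip, dst_ip, built, teardown) tuple with
    Unix timestamps; it is recorded in every window it overlaps.
    """
    graphs = defaultdict(set)
    for src, dst, built, teardown in flows:
        first = built // window
        last = teardown // window
        for index in range(first, last + 1):
            graphs[index].add((src, dst))
    return dict(graphs)


def aggregate(graphs, first, last):
    """Union the edge sets of the consecutive windows first..last."""
    merged = set()
    for index in range(first, last + 1):
        merged |= graphs.get(index, set())
    return merged
```

Keeping one small graph per window means that any user-selected time range is just a union of precomputed edge sets, which is what makes the interactive aggregation cheap.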
• Visualization Design
1. Huge Graph Visualization through Loss-Free Abstraction
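The basic idea of the abstraction, grouping nodes that share the same neighbor set into mega-nodes, can be illustrated with a short Python sketch. It is not the algorithm of the CNG report, whose details are not given here. It assumes an undirected graph without self-loops given as a list of edge pairs, represents each mega-node as a frozenset of its members, and leaves out the aggregation of node and edge attributes. The name `abstract_graph` is hypothetical.

```python
from collections import defaultdict


def abstract_graph(edges):
    """Group nodes with identical neighbor sets into mega-nodes.

    Returns (mega_nodes, mega_edges): mega_nodes is a set of frozensets of
    original nodes, mega_edges is a set of unordered mega-node pairs.
    """
    # Collect the neighbor set of every node (undirected view).
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    # Nodes with the same neighbor set fall into the same group.
    groups = defaultdict(list)
    for node, nbrs in neighbors.items():
        groups[frozenset(nbrs)].append(node)

    # Map each original node to its mega-node.
    mega_of = {}
    for members in groups.values():
        mega = frozenset(members)
        for member in members:
            mega_of[member] = mega

    # Every original edge maps onto exactly one mega-edge.
    mega_edges = {frozenset((mega_of[u], mega_of[v])) for u, v in edges}
    return set(mega_of.values()), mega_edges
```

For example, workstations that contact exactly the same set of websites collapse into a single mega-node, while every original connection remains represented by some mega-edge.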
2. Temporal Anomaly Visualization